Development of the RWTH transcription system for Slovenian
Abstract
In this paper we describe the RWTH automatic speech recognition system for Slovenian developed within the transLectures project. The project aims at supporting the transcription and translation of video lectures freely available on the web. Difficulties arise on all levels of modeling: Slovenian is a morphologically rich language with a high level of inflection (pronunciation model), and a large variety of dialects and recording conditions brings uncertainty into the audio signal (acoustic model). Moreover, the video lectures cover a wide spectrum of topics with a high share of spontaneous speech and technical terms (language model). These issues require application of robust and adaptive methods. Besides the system description, this study mainly focuses on robust acoustic modeling. Building acoustic models from various resources, we also compare the influence of speaker adaptation to different neural network based acoustic features. Systematic application of these methods allows us to reduce the word error rate on the evaluation corpus from 59.2% to 43.4%. We also give a motivation for Slovenian open vocabulary recognition and perform some first steps.
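The word error rate (WER) figures quoted in the abstract can be put in context with a small sketch. It assumes the standard definition of WER (word-level edit distance divided by the number of reference words); the abstract does not say which scoring tool was used, and the function names `word_error_rate` and `relative_reduction` are hypothetical helpers introduced only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumes the standard WER definition; not the scoring tool used in the paper.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline` (both in the same unit)."""
    return (baseline - improved) / baseline


# WER values reported in the abstract, in percent
print(f"{relative_reduction(59.2, 43.4):.1%}")  # prints 26.7%
```

Under these assumptions, the reported drop from 59.2% to 43.4% is 15.8 points absolute, or roughly 26.7% relative.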
Similar resources
The SI TEDx-UM speech database: a new Slovenian Spoken Language Resource
This paper presents a new Slovenian spoken language resource built from TEDx Talks. The speech database contains 242 talks with a total duration of 54 hours. The annotation and transcription of the acquired spoken material were generated automatically, applying acoustic segmentation and automatic speech recognition. The development and evaluation subset was also manually transcribed using the guidelines...
t-Pancyclic Arcs in Tournaments
Let $T$ be a non-trivial tournament. An arc is $t$-pancyclic in $T$ if it is contained in a cycle of length $\ell$ for every $t \leq \ell \leq |V(T)|$. Let $p^t(T)$ denote the number of $t$-pancyclic arcs in $T$ and $h^t(T)$ the maximum number of $t$-pancyclic arcs contained in the same Hamiltonian cycle of $T$. Moon (J. Combin. Inform. System Sci. 19 (1994), 207-214) showed that ...
SINOD - Slovenian non-native speech database
This paper presents the SINOD database, which is the first Slovenian non-native speech database. It will be used to improve the performance of a large vocabulary continuous speech recogniser for non-native speakers. The main quality impact is expected for the acoustic models and the recogniser’s vocabulary. The SINOD database is designed as a supplement to the Slovenian BNSI Broadcast News database. The sa...
Development of Slovenian Broadcast News Speech Database
The paper reviews the development of a new Slovenian broadcast news speech database. The database consists of audio, video and annotation transcripts of about 34 hours of television daily news program captured from the public TV station RTVSLO. The paper addresses issues concerning transcription and annotation of the collected data, provides information on content analysis and basic statistics ...
The 2006 RWTH parliamentary speeches transcription system
This work presents investigations carried out during the development of the RWTH automatic speech recognition systems for the second TC-STAR evaluation campaign 2006. The systems were designed to transcribe parliamentary speeches taken from the European Parliament Plenary Sessions (EPPS) in European English and Spanish, as well as speeches from the Spanish Parliament. The RWTH sys...